Data Preparation for Data Mining

Author

  • Magdi Kamel
Abstract

Practical experience with data mining has shown that preparing data is the most time-consuming phase of any data mining project. Estimates of the time and resources spent on data preparation range from at least 60% to upward of 80% (SPSS, 2002a). Despite this, the task receives too little attention, which perpetuates the idea that the core of the data mining effort is the modeling process rather than all phases of the data mining life cycle. This article presents an overview of the most important issues and considerations in preparing data for data mining.

Similar resources

An Optimal Model for Medicine Preparation Using Data Mining

Introduction: Lack of financial resources and liquidity are the main problems of hospitals. Pharmacies are one of the sectors that affect hospital turnover; because the use and supply of medicines are not forecast, they face over-inventory, large volumes of expired medicines, and sometimes shortages of medicines at the end of the year. Therefore, medicine prediction using availabl...

Designing a System for Trend Analysis of Users in Website Surfing in Iran Using Data Mining and Text Mining Algorithms

Background and Aim: Because web surfing has become part of the lifestyle of the vast majority of people in society, and because more accurate social and cultural policy making is needed in this field, the authors set out to analyze how users view different websites, so as to help politicians and practitioners. Methods: A design science research method is used in this research...

Perform Three Data Mining Tasks with Crowdsourcing Process

In data mining studies, performing feature selection by hand is complex, so some of the labeling has to be sent to workers through crowdsourcing. The process of outsourcing data mining tasks to users is often handled by software systems without enough knowledge of the users' age or place of residence. Uncertainty about the performance of virtual user...

A Reuse-based Spatial Data Preparation Framework for Data Mining

The constant increase in the use of geographic data in different application domains has resulted in large amounts of data stored in spatial databases and in the desire to mine them. Many solutions for spatial data mining have been proposed. Most create data mining languages or extend existing query languages to support data mining operations. This paper presents an interoperable framework for sp...

Report for Data Mining Cup 2002 by :

This research report was written for participation in DMC 2002 (see [8]). It follows the CRISP-DM (Cross-Industry Standard Process for Data Mining) [1] methodology: Business Understanding, Data Understanding, Data Preparation, Modeling and Evaluation. Two popular data mining software products are used: DISCOVERER mainly for data preparation and modeling, and WEKA for feature selection. At the last part o...

Publication date: 2009